Imagine watching your favourite French movie and feeling disconnected just because the English dubbing didn't match the actors' lip movements.
The problem isn’t just limited to you. In the world of global cinema, audiences increasingly crave a diverse range of content, transcending language barriers.
However, a significant challenge arises when films are dubbed into different languages. The issue lies in the audio-visual disconnect: the actors' lip movements on the screen do not match the dubbed audio.
This mismatch can be jarring and often diminishes the viewing experience.
For instance, watching a French film dubbed in Tamil can be disconcerting because the actors' lip movements still align with the original French, creating a visual dissonance that detracts from the immersive experience of the film.
The problem extends beyond just cinema; it affects any video content that undergoes dubbing for a multilingual audience. The lack of synchronisation between audio and visual elements can lead to a loss of authenticity and engagement, making it challenging for viewers to connect with the content fully. The issue is particularly pronounced in a country like India, where the diversity of languages and the popularity of regional cinema make multilingual content a necessity.
This is the exact problem Anjan Banerjee faced while watching the Korean movie Train to Busan, dubbed in English. He experienced a disconnection with the movie as the dubbed audio did not synchronise with the facial movements of the actors.
Anjan realised the problem wasn’t limited to him and thought of solving it for Indians, who increasingly consume content in diverse languages, including Spanish, French, Turkish, and Korean.
He had earlier been a researcher and co-founder of VisageMap, a company that offered personal identification solutions and was later acquired by FaceFirst.
Subsequently, he joined FaceFirst as a Senior Research Scientist. While working there, he began studying the potential of AI along with his batchmates from IIT Kanpur, Subhabrata Debnath and Subhashish Saha.
He quickly realised the potential of AI and began building the VisualDub technology to address audio-visual dissonance.
During this time, the trio reached out to Mandar Natekar, a media and entertainment veteran who has worked for entertainment giants, for advice and mentorship, which paved the way for the birth of NeuralGarage.
Mandar Natekar had been the CMO of Bigadda.com and the Revenue Head of BCCL, and had held a senior leadership position at Viacom 18 Media. In 2019, before co-founding NeuralGarage, he became the chief business officer of KidZania India.
In July 2021, about a year after the COVID-19 outbreak, the trio founded the deep tech startup NeuralGarage with Mandar Natekar to break language barriers visually by leveraging the power of artificial intelligence (AI).
According to the founders, the name ‘NeuralGarage’ represents neural networks, the heart of AI, and pays tribute to tech companies such as Apple, Microsoft, Google, and Meta, which, as the legend goes, were started in a garage.
NeuralGarage has built a flagship product, VisualDub, which reduces the audio-visual disparity in dubbed content by syncing the lip and jaw movements of actors with the audio.
How does it work?
VisualDub is a sophisticated technology that works like a smart translator between sounds and lip movements.
Imagine every sound we make (like 'ah' or 'ee') has a matching lip shape. VisualDub’s proprietary technology uses generative AI so that the actors' lip movements match the dubbed words.
The AI is like a skilled artist who can subtly change the actors' faces – their lips, jaw, and even smile lines – to match the sounds of the spoken words, making the movie look more natural and realistic as if the actors were really speaking the dubbed language.
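The idea that each sound has a matching lip shape can be illustrated with a toy lookup table. The sketch below is purely hypothetical and is not VisualDub's method, which is proprietary and generative: the phoneme labels, the shape names, and the `visemes_for` helper are all assumptions made up for illustration.

```python
# Toy illustration only: a hand-written sound-to-lip-shape table.
# The labels and shape names below are assumptions, not VisualDub's data.
PHONEME_TO_LIP_SHAPE = {
    "AA": "open",       # 'ah' as in "father"
    "IY": "spread",     # 'ee' as in "see"
    "UW": "rounded",    # 'oo' as in "blue"
    "M": "closed",
    "B": "closed",
    "P": "closed",
    "F": "lip-teeth",
    "V": "lip-teeth",
}


def visemes_for(phonemes):
    """Map a sequence of sounds to lip shapes, merging consecutive repeats."""
    shapes = []
    for phoneme in phonemes:
        # Sounds missing from the table fall back to a neutral mouth shape.
        shape = PHONEME_TO_LIP_SHAPE.get(phoneme, "neutral")
        if not shapes or shapes[-1] != shape:
            shapes.append(shape)
    return shapes


if __name__ == "__main__":
    # "mama" spoken as M-AA-M-AA alternates between closed and open lips.
    print(visemes_for(["M", "AA", "M", "AA"]))
```

A real system has to go much further than a lookup: it must render those shapes onto a specific actor's face, frame by frame, which is where the generative AI described in this article comes in.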
Natekar, a part of the team behind VisualDub, explains that fixing the mismatch between sound and lip movement makes dubbed movies feel more natural and local. This helps viewers connect better with the movie. Integrating this technology into movies is pretty straightforward. It's like adding an extra layer to the already dubbed movie, without changing the original dubbing work.
VisualDub is offered in different ways – through an API (which is like a building block for software), as a SaaS (software that you can access and use online), and as desktop software that you can install on your computer.
VisualDub, which is powered by complex AI and computer vision algorithms, has been tested in over 30 languages, including many Indian languages and international languages such as Italian, German, Spanish, Japanese, Korean, and Mandarin.
Funding & Investors
In November 2022, NeuralGarage raised $1.45 million in a seed round led by Exfinity Ventures. The startup is also backed by angel investors RAAY Global (Amit Patni Family Office), Vishal Agarwal and Raj Kulasingam (iconic global investor duo V&R), Anand Singh (Elios & Nexus Global Fund), Sarath Sura (Sunn91 Ventures), Sachin Jain, Narendra Soni (ex-KPMG) and Kejal Shah (ex Avendus PE).
“With the recent advancements in AI, it is imperative that it will play a crucial role in the forthcoming years by enhancing and inspiring creation, delivery, and consumption of content across the horizon. NeuralGarage is committed to contribute to this space and establish a stronger connection between the creator and the consumer,” Subhabrata Debnath, Co-Founder & CTO of NeuralGarage, said earlier.
Partnerships
In June 2023, NeuralGarage announced its partnership with Amazon India for their ad campaigns starring the renowned actor Manoj Bajpayee.
Initially filmed in Hindi, the campaign took a creative leap by being dubbed into multiple Indian languages, including Tamil, Telugu, Kannada, Malayalam, Bengali, Gujarati, and Marathi.
According to the startup, the technique created the illusion that the advertisement had been originally shot in each of these languages, building a powerful and authentic connection with consumers across different regions and breaking the language barrier in advertising.
Besides Amazon India, NeuralGarage has worked with technology giant Microsoft, Hippo Video, and Pixis.
Subhabrata Debnath, Co-founder and CTO at NeuralGarage, stated, “The aim behind VisualDub has been to minimise visual dissonance in content and media. We are committed to continuous innovation, pushing the boundaries of generative AI and extending its reach into various industries. We truly believe that our work would allow the seamless creation of multilingual lip-synced content that builds a stronger and closer bond between the content creators and the end-consumers.”
Future of Generative AI in India
As we look towards the future of the global film and media production industry, it's evident that the demand for multilingual and culturally resonant content is rising significantly.
The trend is not a fleeting one; it reflects a deeply interconnected world where audiences crave authentic experiences in entertainment, regardless of language barriers.
Startups like NeuralGarage are using AI-driven technology such as VisualDub to address this critical gap in the dubbing process, ensuring that the lip movements of actors align seamlessly with the dubbed audio.
According to Statista, the generative AI market is expected to grow substantially, with a forecast compound annual growth rate (CAGR) of over 24.4% from 2023 to 2030.
As of May 2023, the Indian generative AI landscape included over 60 startups providing solutions and services across various industry verticals. Notably, these startups had collectively raised over $590 million in funding, with 2022 witnessing the most significant inflow of funds, according to a NASSCOM report.